{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Data Loader Agents <a id=\"make-a-data-cleaning-agent\"></a>\n",
    "\n",
    "Loading data from different sources and formats is one of the most common tasks in a data science project. In this notebook, we create a data loader agent that handles this for us. The agent specializes in loading:\n",
    "\n",
    "- CSV files\n",
    "- Excel files\n",
    "- Parquet files\n",
    "- Pickle files\n",
    "- And more...\n",
    "\n",
    "### Want To Become A Full-Stack Generative AI Data Scientist?\n",
    "\n",
    "![Generative AI Data Scientist](../img/become_a_generative_ai_data_scientist.jpg)\n",
    "\n",
    "I teach Generative AI Data Science to help you build AI-powered data science apps. [**Register for my next Generative AI for Data Scientists workshop here.**](https://learn.business-science.io/ai-register)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "vscode": {
     "languageId": "bat"
    }
   },
   "source": [
    "# Table of Contents\n",
    "\n",
    "1. [Load Libraries](#load-libraries)\n",
    "2. [Setup AI](#setup)\n",
    "3. [Create The Agent](#create-the-agent)\n",
    "4. [Usage](#usage)\n",
    "    1. [Example 1: What tools do you have access to? Return a table.](#example-1-what-tools-do-you-have-access-to-return-a-table)\n",
    "    2. [Example 2: What folders and files are available?](#example-2-what-folders-and-files-are-available)\n",
    "    3. [Example 3: What is in the data folder?](#example-3-what-is-in-the-data-folder)\n",
    "    4. [Example 4: Let's load the bike sales data from the CSV file.](#example-4-lets-load-the-bike-sales-data-from-the-csv-file)\n",
    "    5. [Example 5: What folders and files are available in my Documents directory?](#example-5-what-folders-and-files-are-available-in-my-documents-directory)\n",
    "    6. [Example 6: Search for 'csv' files recursively in my current working directory.](#example-6-search-for-csv-files-recursively-in-my-current-working-directory)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Load Libraries <a id=\"load-libraries\"></a>"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# * Libraries\n",
    "\n",
    "from langchain_openai import ChatOpenAI\n",
    "import pandas as pd\n",
    "import os\n",
    "import yaml\n",
    "\n",
    "from ai_data_science_team.agents import DataLoaderToolsAgent"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Setup AI <a id=\"setup\"></a>\n",
    "\n",
    "This section of code sets up the LLM inputs and the logging information. Logging is used to store AI-generated code and files while the AI Data Science Team processes files.\n",
    "\n",
    "*Important Note:* This example uses OpenAI's API, but any LLM can be used, such as Anthropic models or local LLMs served through Ollama."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "ChatOpenAI(client=<openai.resources.chat.completions.Completions object at 0x7f9a611da530>, async_client=<openai.resources.chat.completions.AsyncCompletions object at 0x7f9a611d8340>, root_client=<openai.OpenAI object at 0x7f9a29813b20>, root_async_client=<openai.AsyncOpenAI object at 0x7f9a611da560>, model_name='gpt-4o-mini', model_kwargs={}, openai_api_key=SecretStr('**********'))"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
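   ],
   "source": [
    "# * Setup\n",
    "\n",
    "# Read the OpenAI API key from a local credentials file (the context manager closes the file handle)\n",
    "with open('../credentials.yml') as f:\n",
    "    os.environ[\"OPENAI_API_KEY\"] = yaml.safe_load(f)['openai']\n",
    "\n",
    "llm = ChatOpenAI(model=\"gpt-4o-mini\")\n",
    "llm\n"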
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Create The Agent <a id=\"create-the-agent\"></a>\n",
    "\n",
    "Run this code to create the agent with `DataLoaderToolsAgent()`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAALUAAADqCAIAAABr6ZPcAAAAAXNSR0IArs4c6QAAHVBJREFUeJztnXlgE9UCr89MkjbN0mZr03RfsRQKRYtWqCwCylLZi1KqggJyL3BFQa8svsu9CpeHqFwVcGEHlUUQKEtboEBL2ZciO3Rf0iVtmqZZmmQy8/4YXwMkDQUT5qSc7692JnPyy+TLmTkz55zBKIoCCEQ74EwHQEAN8gPhDOQHwhnID4QzkB8IZyA/EM5gMx3g0dCozC1NhEFrNeqsZhPJdJwOwfHCWWzA82XzhKyAEG+2lyf9JjGPuP5RW2YsuaovvaYXyb0srSTPlyXw47A5GNO5OoQXF29utBi0hKHF2qA0ycO4UQn8Ls8JuTwW09EeDux+NCpNp/Y3+ghZ4gCvyO58idyL6UR/lco7hpKrelVla3AM78VUKdNxHgLUfhTsayi/ZeiTKo2I5zOdxfVcOKI+c0A9JEP+zHNCprO0C6R+kCS17YvKF4ZJonsImM7iRiiKOrmnAcOxlFEyprM4BkY/rAT1/cfFb3wUKlV4M53lSXD5WFNLE9FvrD/TQRwAnR+EhfxxfsnfV8QwHeSJculYU01J64h3FUwHeRDo/NiypPy16QqRv8efhz4q53PUVoJKHg7XGStcbfG83aqXxsieQjkAAL1fkRBmsuSqjukg9wGRHzWlxvpKU6dsqnSQxAHiE7tUTKe4D4j8KNjX2HckXLXrE0YgYkclCP7I1zAdxAYsfpTd0PuHeCkifZgOwjB9RkpLrumZTmEDFj+Kr+j8Q7hP7O2uXbtmMpmY2twJHA6OAVBx2+COwh8DWPwova6P7PaEzjwyMzMnT55sNBoZ2fyhRHbnl0JThUDhR02ZMSSW5yN4QverHvunT18LcFPN0UZUAl9da3brW3QcKPxoVllYbLfcjC0vL58xY0ZKSsrw4cOXLl1KkmRmZuayZcsAAIMHD05KSsrMzAQAFBYWzpo1KyUlJSUl5b333rt58ya9uUajSUpK2rJly6JFi1JSUqZNm+Zwc9ciEHFqSo1WAorrUlD0/9BrrXxft1Qen332WVlZ2dy5c/V6/YULF3Ac79u3b0ZGxtatW1euXCkQCMLCwgAASqXSZDJNnToVx/GdO3f+4x//yMzM5HL/PB9at25dWlra999/z2Kx5HK5/eYuh+/L1msJXwnHHYU/EnD40Uz4ydyyL5RKZVxc3JgxYwAAGRkZAACJRBISEgIA6N69u0gkol82bNiw4cOH03/Hx8fPmDGjsLAwOTmZXpKQkDBz5sy2Mu03dznIj/vAcOCmzj7Dhw/fuHHj8uXLp06dKpFI2g2AYceOHdu6dWtpaSmPxwMANDY2tq19/vnn3ZHNCd48nIKjcxwU5x9cHquliXBHyTNnzvzwww9zcnJGjhy5Y8eO9l62du3ajz76KD4+/quvvpozZw4AgCRt34+Pz5O+KqNRWXhCKHqXQeEHXZ26o2QMw9LT0/fu3du/f//ly5cXFha2rWq7MWkymTZs2DB69Oi5c+cmJiYmJCR0pGS33tfUawm+LxRVOxR+CKVsnOWW4wvdFuXz+TNmzAAA3Lp1q60+UKn+vNNhNBpNJlPXrl3pfzUazQP1xwM8sLnLaTVaFZFcjjcUXw0UkobG8jJ/UPYb4+/yVu4///lPgUCQnJx88uRJAAAtQc+ePVks1ooVK0aOHGkymcaNGxcTE7Nt2zapVKrT6X788Uccx4uKitor035z12YuvaqHpPIAALAWL17MdAYAAGioNuFsTBLo4jv7VVVVJ0+ezMrKMhqNs2fPHjBgAADA19dXLpcfPnw4Pz9fq9WmpqY+++yzBQUFO3bsKC8vnz17dnh4+K5duyZNmmSxWDZv3pySkhIfH99Wpv3mrs18IUcd3VPg8l3xeMDSP+ju5RZVtalPKqTdMJ8ku7+rGvleEJuDji/3ENtLePpAY7dkv/YuhKhUqrS0
NPvlFEVRFIXjDvbm+++/T1/5cCtTp051eDCSy+V1dXX2y1NTU+fNm9deaRcOqxWRPpDIAVH9AQAouqK7e6ll2BTHfTCtVqvD3U2SJEmSbLYD0f38/Ph8t9/zU6lUFovFfrnFYuFwHLjO4/Hau7BGWqk1HxfP/BKivrcQ+QEAyNla22ug2D/4qei2bs/FI2pvHt69j7suyz4GsNRjNK9kBG5fUQmVsk+MO5daGpRmqOSAzg8AwMSPQ39ZVsF0iieNssRw4XDTq28FMh3kQeA6vtDomok9a6ozPglnOsgTouKW4cIR9dhZIUwHcQCMftCXQ7atqJz4cecfQvfHSU3pNf2oGcFMB3EMpH7QZG+uBQD0eU0qFDN/p9vllF7Tn9rfEN1TkDwM3l77UPtBn7Wdymzs+rxQHs7tHENj9Fqi9Lq+6o6BsFB9UmWQXCdtD9j9oLl1QXv3kq78lqFHih+GA74vWyBie8pEPCw2ptNYDFqrvplQVZta1ERkN35cb6EiygMGc3iGHzQUSZXd1DerCL2WMOqsJqOLu9AYjcaSkpJu3bq5tliBH9tKUDxfFt+P7R/iFRjuAVq04Ul+uJvi4uL58+c76Ub0FOIZVTSCKZAfCGcgP2xgGBYREcF0CrhAftigKKqsrIzpFHCB/LgPgaAzT4f3GCA/7kOng2v6HsZBftjAMMzfH8ZJBBkE+WGDoij3jVrwUJAfNnAcj46OZjoFXCA/bJAkWVxczHQKuEB+IJyB/LCBYZj7pmzwUJAfNiiKogffItpAfthA9Yc9yA8bqP6wB/mBcAbywwaO4/TcYog2kB82SJKsqqpiOgVcID8QzkB+2MBxPDIykukUcIH8sEGSZGlpKdMp4AL5gXAG8sMGun9rD/LDBrp/aw/yA+EM5IcNNL7BHuSHDTS+wR7kB8IZyI/7QONfHgD5cR9o/MsDID9s4DgeGhrKdAq4QH7YIEmysrKS6RRwgfxAOAP5YQPDMKkU3qkEGQH5YYOiqHsfS4hAftwHhmFRUVFMp4AL5IcNiqJKSkqYTgEXyA8bqP6wB/lhA9Uf9iA/bGAYJpfLmU4BF2h+XDBx4kSDwUBRlMVi0Wq1MpmMoiiTyZSdnc10NOZB9Qd47bXXamtrlUqlSqUymUzV1dVKpdLX15fpXFCA/AATJ058YNgcjuN9+/ZlLhFEID8AhmHjx49nsVhtS8LCwhw+S/UpBPkB6CokOPjPJzhhGPbSSy+1/fuUg/z4k4yMDG9vbwBASEjIuHHjmI4DC8iPPxk9enRwcDBFUX369EGj+Nt4+PPVLSayscZs0FmfSB4mGf3Ke1lZWQNfeKPkmp7pLO4Fx4CfjCOWP/zRZg+5/pG3W1VUqOP7sX0EDzcJ4SnwRSxlkZHvy+rxkigm0VmXW2ff+qENNWIFN20uGtLeOSFJKvdXJQVAbPuKtFt/HP65TiT3juuN5mvr5ORsrn5usCiiq+Nngzo+P62rbG01kkiOp4E+IwOunGhub61jP9Q1ZjYHNW2eCgQijrLYSJgdPwzUsQR6LSGSQf3cXoQLCYz00TRYHK5y7AdpBVbiab+v+/Rg0BIYhjlchQ4iCGcgPxDOQH4gnIH8QDgD+YFwBvID4QzkB8IZyA+EM5AfCGcgPxDOQH4gnOEuPz5fuuityQ/v5VtbW1NTq/wrb3T8xJGBg5IqKlw2b+mBg3sGDkpqbGxwVYFPjL++M+1hsv6oVlalZ4y8ffsGgxk6DW7amUz6YSWIzjH61yWfgqKoauXjP93MTTvTlb2Oc4/lbNr8Y11dTUR4FEna+pscytq3Z8+OktIiHx/e871fnDVznkgkrqlVvj1lPADg3//55N8AvPpq6icfLzabzZu3/JSbm12vqpNKZa8MGTH57ffuHdnWEXJyDvz86walskoqlY0YPmZS+hQcx52XfLfo9rfffXH79g2pRBYaGn5vaXv3/bZj59aGhvrAwKBBLw99fcKb3t7ezc2a0WMHz3jv/btFtwsK
jsfGxn2zcm17eerr69ZtWH32bIFerwsNDU+fOGXwoKH0qhs3r61a/WVJyV2pRBYRGV1UdHvzxt1eXl6tra1r1606mptlNptCQ8InTHjz5YGvAAB+2/VL7rGctPGT1q1b1ahuiI2Nm/fhorCwCIc78xG/Pce4zI8jR7OWLF3UKzFpQlpGba3yl183Bgf/OZfojRtXw8IihgwZ3tSk3v37Nr1B/98lK6US2cIFny9ZumjK5Bm9EpPEYgkAgMViXbx49sU+/YIUIUVFt7f+vF4o9J2QltHxGNnZ+5ctXzxo0NB33/n7jRtX129YAwB4M+NdJyVXVJR98OF0P1/RtKmzWCz25i0/tZW2cdOPO3/bOnbMG+HhUZWVZdt3bK6qrljwyX/otVu3rhs1Ku3LFd87N5iwErduXR81cryfryjvZO6SpYuCg0O7xnWrq6ud99HfYmPjFs7//Oy5gv0Hfp82dZaXlxdJkgsXfVBbq5yUPkUkkhQWXvjs8wWtrcbhw0YBAG7evLZjx5a5cxcRBPHVV0v++3//tWbVJoc70yW4xg+TyfTdqhU9evT6YvkqemdVV1cWFd+h1374wYK27idsNnvrz+tNJpO3t3eX2DgAQFhYREJCIr2WxWKtXrWp7cXKmqq8/NyO+0FR1Nr1qxISEhct+BwA0O+ll1tatNu2bxo3diKPx2uv5O9//B+O4au+2ygSienB2Sv/twwA0NCg+vmX9YsWLunfbxC9lVTq//XK/86aOY/+Nz4+Yeq7Mx+aKkgRvHH9Tvqthw0bNWbc4IKC413juh0+ctBoNP7r02USibRv3/5X/rh05uzJ9ImT8/Jz/7h6+defM2UyfwDA4EFDjUbDrt2/0n4AAJZ8/rVEIgUAjB37xuo1Xzdrm/18/ex3pktwjR9XrxU2N2vGj0tv+yXh9/ykLBbL7t+3HT5ysL6+1tubS5KkRtMklwc6LKqpSb15y0/nL5xpadECAIQCYcdjVFVVNDSoXp/wZtuS3r1fPHhob1V1RZfYOIclt7a2nj9/euTI8bQctMH0HxcvniUIYsnSRUuWLqKX0Af4BlW9VCoDADz77PMdDFZUfGfjph/ok0er1apWNwIAVKo6Pp9Pf9MYhgUFhdTV1QAAzpw5SRBEesbIts2tViufbxuCwOX60H/I5QoAQGODys/Xr+N76ZFwjR/19bUAgMDAIPtVFEUtWDjn9p0bb781PT6+R35+7rbtm0nKcW9Ytbpx+oxJPj68d6b8LSgoZP361ZVV5R2PodPrAAAika12FQp96W9UJvV3WHKjuoEgCIWj5I3qBgDA0iUrA/zvm1QoKChEr9fd+z0559Ll8//8ZHavxKSPP/oXn8f/P4s/oj9+cHCoXq8vKSmKioqxWCxFRbcTE5MAAE1NjVKp7KsV399bCIvt4JvisDkAACvpxqGNrvFD5CcGAGg0Tfarrly5dPHSuYULPqdPyqqrKpyUsy9zV1OTetW3G+naJSAg8JH8oL/I5mZN25KmJjVtSXsl08nplz0A7RZdaXc8gz1btqwNCgpZumQlXTP5/H+rXn0ldedvPy9YNOeVISMKr1wkCGLyW9Pp99VomuRyBT1enFlc076Nju6C4/iRo4fsVzVrNQAA+ujY9i/duvH25tLVY9uLtVqNSCRuO/Q0azUPbbN5cbwAAFptMwBAKpUFyhXnzhW0rT1x4giXy42Jeaa9kvl8fnBw6PETRyyWBztw9+rVG8Ow3/dsb1tiNBoffd+AZq0mJroLLYfZbDYYDfTH9/MTzZo5z9ubW1panPRc8k8//BISEkYftqxW677M3x7pfe13pktgLV7soCFUXWy0EiAwokP1J/3YFJWqLisrs7y8xGg0nD1bkJ2T6ePDGzP6dT5PsHffzrq6Gh6Pn5efu2XrWovF0isxKSwsgs/nHz588Or1Qh6Pf/Hi2S6xXa1W66FD+0jSarZYtm3bdCLvqF6vHz0qjcvltvfWbA7n9z3bb92+HhYWoQgM
Egp8t+/cqlLV0Sc9R44empT+Tu+kZJPZ1F7JQqHfwUN7z54tIAjizp2bO3/7WattnpCWEShXtLS05OQcuHP3pslkOnO2YOmyT3v16i2Vykym1m3bNycnp8Q9E//QnVNeUXbixBGxWFJXV7vym2XV1ZUYAKmpY2/dvvGvxR9NfWdmVHSsSCS2Wq0yWQCO4xER0ecvnMnO2d+s1TQ1qbOy93/73fLUEWPZbPaNm1fPnz89KX0Kh8Ohz7eO5ma/9to4qURmvzPZjg5JDrl9oTkmUcATOmiFucYPAMBzz72g1+sKTp04f/4UhmFCoa/RaBwz+nU+nx8REZWVnZmVnUkQxMIFnzc01F+7Vvjqq6kYhsXH9zh3/lTuseyaWmVK34HduvWgKHLP3p35eUeDgkPnzf306tXLRqOBPjA7RCgQKgKDLl0+j2N476TkmJguYrEk91jOoax9miZ1evqUjEnvYBgWHh7ZXsnRUbF+fqJLl86dLDjeoKqP7RJXXHxnQloGj8fr3ftFHo9/+nR+7rHsquqKvn3693mxn4+PzyP50S2+Z3l5ye7ftxVeuTCg/5Cxo1/PPZYdGxsXGhJ2+kz+/gO/5+UdPZF3NCs78/Tp/CFDRnh7ew/oP0Sn0x4/fjgvP1dv0A0bOiohIRHHcSd+2O9MQYdP7Z344Xj87blstbkV9BzgsmY0wiFWq5Vu8Vmt1vyTx/79n0++XLHm2V69n3CMfWsqhr4dKFU4GBHnAbM26HS6iZNSHa56b/r7qSPGPPFED/J4CSsqyt7/YNqLyS/FRHcxmU15eUe5XG5IcJibwz4aHuAHj8f78YdfHK7yFbqr3f9IPF5CPl8w6OWhZ87kHz5yUCAQJnRPnDNnfkAAXBP0ouMLwtnxBfUPQjgD+YFwBvID4QzkB8IZyA+EM5AfCGcgPxDOQH4gnIH8QDgD+YFwhuP7L1wei7Q67gKI6Hz4Sjl4Oz3wHdcffjJ2Tdnj9JVCeBwWM1ldZBAHOJ7u1rEfIbE8s7HzP9ADAQCoLTU8k9RuTyLHfrDY2AtDJTmbq90ZDME8LU3m05mqgWkB7b3A2fNfqouN2ZtrE/tLRHJvntADeoogOgiGA3WtSaexXC/QTJofxvFqt5nykOcD6TTEpdym2rJWQ0vnP9xQFGUxm70gGFXgbkRyLwyA0C4+z74sdv5K9PxsG8XFxfPnz9+xYwfTQSACXf9AOAP5gXAG8sMGjuPR0dFMp4AL5IcNkiSLi4uZTgEXyA8bOI6HhoYynQIukB82SJKsrKxkOgVcID9s4DgeGYke9nsfyA8bJEmWlpYynQIukB82cBwPCQlhOgVcID9skCRZVfX4M5B2SpAfCGcgP2xgGBYeHt6BFz5FID9sUBRVXv4I0+E9DSA/EM5AftjAMAyGKSWhAvlhg6Iok8nEdAq4QH7cB5/PZzoCXCA/7kOv1zMdAS6QHwhnID/uIyCg3Z7+TyfIj/uor69nOgJcID8QzkB+2ED9x+xBfthA/cfsQX4gnIH8sIHGN9iD/LCBxjfYg/xAOAP5YQO1X+xBfthA7Rd7kB/3IRQ+wuOYnwaQH/fR0tLCdAS4QH4gnIH8sIFhWETEX3pUducD+WGDoqiysjKmU8AF8sMGGp9tD/LDBhqfbQ/ywwaqP+xBfthA9Yc9yA8bqP6wB82PC6ZPn240GjEM0+v1tbW1UVFRGIYZjcadO3cyHY150KzqoHv37lu2bGn7ndy8eRMAoFAomM4FBej4At58880Hpg2iKCoxMZG5RBCB/ABisXjYsGH3LlEoFOnp6cwlggjkBwAATJgwoa3nB0VRPXv27Nq1K9OhoAD5AQAAIpFo6NCh9N8KhWLSpElMJ4IF5MefpKWlhYWFURTVo0eP+Ph4puPAgse3X0iSMmitf72RzsF8hwwcmZWVNWHs2y1NxF8PxvbCfPjtPBXSc/DI6x/VRcbSa/rGOnNdeau5lfQP5enUZqZD3QeFAauFJEwkV8BSRPkowr0j
u/NF/o6fEQkzHubH2Sz1rXMtuBeLL+bxJD5sLxbbC97fKEVShNlqaSV0jfoWlcE/hNs9WRCVIGA61yPgMX5cPak5ubfRP9JPHOrHYnvkaZNJb24sa8KBtf94mSLCh+k4HcID/KBIsOs7JWBzJGEinOWRZtyLQdOqb2iJiOP2HiJiOsvDgd0PK0Ft+qw8IEYqkPGYzuJK6u82SGRgcLqc6SAPAWo/SJLa9mWVLDrAy8fj21n2NJSqQyJZycMkTAdxBtTV9dalFdJIWaeUAwAgi5RUl1vPHGxkOogz4PXj4IZaSbjIm+95bcKOIw2XlN0yFV+Bd9ANpH4UXdFpmyiBzJOago+HPC7g0KY6plO0C6R+5O9pkIQ/5NHfnQMMwxRdxAX7GpgO4hgY/bh+WsOX+HjxOEwHeUJIw0U3z7e0Gq1MB3EAjH4U5mmFATCOk25orJz36QuX/8hxeclCf/6N01qXF/vXgc4PfTNh0Fp9fJ+u5ygIZPyiKzBO7Q2dHyXXdMLOdSmsI/DFXHWtyQTfIQa6SwuqKrO3H9dNhReVXDx4eLWy9o5QIImJTBo25G++Qlm18vZ3a6e9++bXB3NWK2vviEWKEa/M6t61H72JTt+09+DX12/lcdje0ZHPuSkYAEAUyKstaw3vCtcDJKCrP5rqLRz33JK9W3z+p83/kAdEThi9sF+f9JKyy99vmGk2twIALBbT1u0L+/V542/vrBGLAn/Z+alerwEAWAjzDxtnX795ol+f9BGvzlI3Kd0RjIaiML0W1R8Pw6C18uVu8WPPgS+Tk8aMSZ1H/9sl5oUvvnn9dtEZiUgBABg9Ym5iwhAAwPAhf1+55u3isss9ug0sOLOzpvbu9Le/7RLzPAAgIjRh+TevuyMbAADnsAxaF/RLci3Q+cEVsNnervdD3VRTpyptUFeeubDn3uWa5jraDy/OnzfcxSIFAEDbogIAXLt5QiGPoeUAAOC4G/uaePlwCIJ0X/mPB3R+tOoJwmR1ea+fFl0jAGDIwKk94gfeu1wolKnV1fcuYbM4AACStAIANM21wYpnXJukPcwGC5sNXV8n6Pzg+7IsJoIrdPFtFx+ukD7PCPB/hBmCBHyxTt/k2iTtQRJWvh90rXrozk/Fci+rxfXVrL8sTOQXeP5SpslspJdYrQRBWJxvFax4prL6Rr3qSTwUF8MB3xe6nyt0fsjDvAwao8uLxTBs1PAPtC0N3/7wbsHZ3/JPb//mh3dPnfvN+VYDX3oLw/DV62fk5m26cPnA7v1fuDxYG+oqfVCUuxr2jw10fkR2E7TUG9xRckL8gHcyvmKxOPsOfn3k+HqxODAqopfzTWTSkGlv/U/kG5Cd+9Ph4+uD5LHuCAYA0DUa/UO4bC/ovg4Y+49tW1HpGyzhiaD7MbmPujuNzyRyEvtDd8saugMeACBxgN/lPK0TPw4fX3ei4Bf75SGKuKqaWw43mT1trTzAZXO/HDy8+tS5XfbLfbhCY6vjzj5zZmyUSdud3F1Vrk2bDeOjRWCsPwAAW5ZW+Mf4t9eKMRi0rSad/XIMa/fj+PkGsFgu+zHoDc0mk4PbaRQFMMzxJk4CqErU4TGs51+FsSMqpH6U3dSf2t8c1B327t1/HStBFp+unL40iukgjoHuhIgmoivfP4ilqYaxS4RrUV6rGzY5kOkU7QKpHwCAIZPkxiadQdPKdBA3Une3oVsyP7QLvP0ZID2+tLFjZbVALuqUbZmaWw3xvbk9U/yYDuIMeOsPmglzgpsqGrW18I4AeDxqbtSHRbMgl8MD6g+a7M11Gg0Qh4g6wVgpbb3BpNH1fEnQ5VkY+9g+gGf4AQC4c6klf08jX+IjCfPz8vHIru0GTauqRO0rZg0YLxMHeMa4L4/xg+aP/Oarp7QmIymQ8vhSHtuLxfZmQTvdA2GyWswE0Uq0qAzaekN4PD+xn68i0jNmdqDxMD9oGmtMpdf19ZXm+kpTq47w8/duboRr/iAMA5ZW0tuH5SNkySO4
IdHcqAS+tw903Tseikf68QAWE0WScH0KDAMcLwzD27mY6jl0Bj8Q7gPSIzcCEpAfCGcgPxDOQH4gnIH8QDgD+YFwxv8Di0DfHnxxNfQAAAAASUVORK5CYII=",
      "text/plain": [
       "<ai_data_science_team.agents.data_loader_tools_agent.DataLoaderToolsAgent object at 0x7f9a2982cd90>"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Make a data loader agent\n",
    "data_loader_agent = DataLoaderToolsAgent(\n",
    "    llm,\n",
    "    # recursion_limit caps how many steps the ReAct agent may take per request\n",
    "    invoke_react_agent_kwargs={\"recursion_limit\": 10},\n",
    ")\n",
    "\n",
    "data_loader_agent"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Usage <a id=\"usage\"></a>\n",
    "\n",
    "Here are several examples of how to use the agent:\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Example 1: What tools do you have access to? Return a table."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "---DATA LOADER TOOLS AGENT----\n",
      "    \n",
      "    * RUN REACT TOOL-CALLING AGENT\n",
      "    * POST-PROCESS RESULTS\n"
     ]
    },
    {
     "data": {
      "text/markdown": [
       "Here is a table of the tools I have access to:\n",
       "\n",
       "| Tool Name                  | Description                                                                                          |\n",
       "|----------------------------|------------------------------------------------------------------------------------------------------|\n",
       "| `load_directory`           | Loads all recognized tabular files in a directory.                                                 |\n",
       "| `load_file`                | Automatically loads a file based on its extension.                                                 |\n",
       "| `list_directory_contents`   | Lists all files and folders in the specified directory.                                            |\n",
       "| `list_directory_recursive`  | Recursively lists all files and folders within the specified directory.                            |\n",
       "| `get_file_info`           | Retrieves metadata (size, modification time, etc.) about a file.                                   |\n",
       "| `search_files_by_pattern`  | Searches for files that match a given wildcard pattern, optionally in subdirectories.              |\n",
       "\n",
       "If you have any specific tasks you would like to perform using these tools, feel free to ask!"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# What tools do you have access to? Return a table.\n",
    "data_loader_agent.invoke_agent(\"What tools do you have access to? Return a table.\")\n",
    "\n",
    "data_loader_agent.get_ai_message(markdown=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Example 2: What folders and files are available?\n",
    "\n",
    "**Important note about searching directories:** By default, the agent looks for files and folders in your current working directory. However, you can change this behavior by prompting the agent to look in a different folder (e.g. Documents, Downloads, etc.)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "---DATA LOADER TOOLS AGENT----\n",
      "    \n",
      "    * RUN REACT TOOL-CALLING AGENT\n",
      "    * Tool: list_directory_contents | /Users/mdancho/Desktop/course_code/ai-data-science-team\n",
      "    * Tool: list_directory_recursive | /Users/mdancho/Desktop/course_code/ai-data-science-team\n",
      "    * POST-PROCESS RESULTS\n"
     ]
    },
    {
     "data": {
      "text/markdown": [
       "Here is the folder and file structure at the root of your directory:\n",
       "\n",
       "```\n",
       "ai-data-science-team/\n",
       "  - LICENSE\n",
       "  - MANIFEST.in\n",
       "  - README.md\n",
       "  - ai_data_science_team/\n",
       "  - ai_data_science_team.egg-info/\n",
       "  - apps/\n",
       "  - build/\n",
       "  - data/\n",
       "  - dist/\n",
       "  - examples/\n",
       "  - h2o_models/\n",
       "  - img/\n",
       "  - logs/\n",
       "  - mlruns/\n",
       "  - requirements.txt\n",
       "  - setup.py\n",
       "  - temp/\n",
       "```\n",
       "\n",
       "This structure shows the top-level folders and files within the `ai-data-science-team` directory."
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# What folders and files are available?\n",
    "data_loader_agent.invoke_agent(\"What folders and files are available at the root of my directory? Return the file folder structure as code formatted block with the root path at the top and just the top-level folders and files.\")\n",
    "\n",
    "data_loader_agent.get_ai_message(markdown=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Many of the tools return artifacts, such as the directory listing produced by the previous request. We can access them as a DataFrame as follows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>type</th>\n",
       "      <th>name</th>\n",
       "      <th>parent_path</th>\n",
       "      <th>absolute_path</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>directory</td>\n",
       "      <td>ai-data-science-team</td>\n",
       "      <td>/Users/mdancho/Desktop/course_code</td>\n",
       "      <td>/Users/mdancho/Desktop/course_code/ai-data-sci...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>file</td>\n",
       "      <td>LICENSE</td>\n",
       "      <td>/Users/mdancho/Desktop/course_code/ai-data-sci...</td>\n",
       "      <td>/Users/mdancho/Desktop/course_code/ai-data-sci...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>file</td>\n",
       "      <td>MANIFEST.in</td>\n",
       "      <td>/Users/mdancho/Desktop/course_code/ai-data-sci...</td>\n",
       "      <td>/Users/mdancho/Desktop/course_code/ai-data-sci...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>file</td>\n",
       "      <td>README.md</td>\n",
       "      <td>/Users/mdancho/Desktop/course_code/ai-data-sci...</td>\n",
       "      <td>/Users/mdancho/Desktop/course_code/ai-data-sci...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>directory</td>\n",
       "      <td>ai_data_science_team</td>\n",
       "      <td>/Users/mdancho/Desktop/course_code/ai-data-sci...</td>\n",
       "      <td>/Users/mdancho/Desktop/course_code/ai-data-sci...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>527</th>\n",
       "      <td>file</td>\n",
       "      <td>app.py</td>\n",
       "      <td>/Users/mdancho/Desktop/course_code/ai-data-sci...</td>\n",
       "      <td>/Users/mdancho/Desktop/course_code/ai-data-sci...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>528</th>\n",
       "      <td>file</td>\n",
       "      <td>data_gen.py</td>\n",
       "      <td>/Users/mdancho/Desktop/course_code/ai-data-sci...</td>\n",
       "      <td>/Users/mdancho/Desktop/course_code/ai-data-sci...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>529</th>\n",
       "      <td>file</td>\n",
       "      <td>pypi.md</td>\n",
       "      <td>/Users/mdancho/Desktop/course_code/ai-data-sci...</td>\n",
       "      <td>/Users/mdancho/Desktop/course_code/ai-data-sci...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>530</th>\n",
       "      <td>file</td>\n",
       "      <td>test.py</td>\n",
       "      <td>/Users/mdancho/Desktop/course_code/ai-data-sci...</td>\n",
       "      <td>/Users/mdancho/Desktop/course_code/ai-data-sci...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>531</th>\n",
       "      <td>file</td>\n",
       "      <td>workflow.drawio</td>\n",
       "      <td>/Users/mdancho/Desktop/course_code/ai-data-sci...</td>\n",
       "      <td>/Users/mdancho/Desktop/course_code/ai-data-sci...</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>532 rows × 4 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "          type                  name  \\\n",
       "0    directory  ai-data-science-team   \n",
       "1         file               LICENSE   \n",
       "2         file           MANIFEST.in   \n",
       "3         file             README.md   \n",
       "4    directory  ai_data_science_team   \n",
       "..         ...                   ...   \n",
       "527       file                app.py   \n",
       "528       file           data_gen.py   \n",
       "529       file               pypi.md   \n",
       "530       file               test.py   \n",
       "531       file       workflow.drawio   \n",
       "\n",
       "                                           parent_path  \\\n",
       "0                   /Users/mdancho/Desktop/course_code   \n",
       "1    /Users/mdancho/Desktop/course_code/ai-data-sci...   \n",
       "2    /Users/mdancho/Desktop/course_code/ai-data-sci...   \n",
       "3    /Users/mdancho/Desktop/course_code/ai-data-sci...   \n",
       "4    /Users/mdancho/Desktop/course_code/ai-data-sci...   \n",
       "..                                                 ...   \n",
       "527  /Users/mdancho/Desktop/course_code/ai-data-sci...   \n",
       "528  /Users/mdancho/Desktop/course_code/ai-data-sci...   \n",
       "529  /Users/mdancho/Desktop/course_code/ai-data-sci...   \n",
       "530  /Users/mdancho/Desktop/course_code/ai-data-sci...   \n",
       "531  /Users/mdancho/Desktop/course_code/ai-data-sci...   \n",
       "\n",
       "                                         absolute_path  \n",
       "0    /Users/mdancho/Desktop/course_code/ai-data-sci...  \n",
       "1    /Users/mdancho/Desktop/course_code/ai-data-sci...  \n",
       "2    /Users/mdancho/Desktop/course_code/ai-data-sci...  \n",
       "3    /Users/mdancho/Desktop/course_code/ai-data-sci...  \n",
       "4    /Users/mdancho/Desktop/course_code/ai-data-sci...  \n",
       "..                                                 ...  \n",
       "527  /Users/mdancho/Desktop/course_code/ai-data-sci...  \n",
       "528  /Users/mdancho/Desktop/course_code/ai-data-sci...  \n",
       "529  /Users/mdancho/Desktop/course_code/ai-data-sci...  \n",
       "530  /Users/mdancho/Desktop/course_code/ai-data-sci...  \n",
       "531  /Users/mdancho/Desktop/course_code/ai-data-sci...  \n",
       "\n",
       "[532 rows x 4 columns]"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_loader_agent.get_artifacts(as_dataframe=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Example 3: What is in the data folder?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "---DATA LOADER TOOLS AGENT----\n",
      "    \n",
      "    * RUN REACT TOOL-CALLING AGENT\n",
      "    * Tool: list_directory_contents | /Users/mdancho/Desktop/course_code/ai-data-science-team/data\n",
      "    * Tool: list_directory_contents | /Users/mdancho/Desktop/course_code/ai-data-science-team/data\n",
      "    * Tool: list_directory_contents | /Users/mdancho/Desktop/course_code/ai-data-science-team\n",
      "    * Tool: list_directory_recursive | /Users/mdancho/Desktop/course_code/ai-data-science-team\n",
      "    * POST-PROCESS RESULTS\n"
     ]
    },
    {
     "data": {
      "text/markdown": [
       "The \"data\" folder contains the following files:\n",
       "\n",
       "1. `bike_sales_data.csv`\n",
       "2. `churn_data.csv`\n",
       "3. `dirty_dataset.csv`\n",
       "4. `northwind.db` \n",
       "\n",
       "If you need further details about any of these files, let me know!"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# What is in the data folder?\n",
    "data_loader_agent.invoke_agent(\"What is in the data folder?\")\n",
    "\n",
    "data_loader_agent.get_ai_message(markdown=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Example 4: Let's load the bike sales data from the CSV file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "---DATA LOADER TOOLS AGENT----\n",
      "    \n",
      "    * RUN REACT TOOL-CALLING AGENT\n",
      "    * Tool: load_file | data/bike_sales_data.csv\n",
      "    * POST-PROCESS RESULTS\n"
     ]
    },
    {
     "data": {
      "text/markdown": [
       "The bike sales data has been successfully loaded from the file `data/bike_sales_data.csv`. If you need any specific information or analysis from this data, please let me know!"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Load the bike_sales_data.csv file from the data folder.\n",
    "data_loader_agent.invoke_agent(\"Load the bike_sales_data.csv file from the data folder.\")\n",
    "\n",
    "data_loader_agent.get_ai_message(markdown=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To extract the data from the artifact, we can use the following code:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>date</th>\n",
       "      <th>bike_model</th>\n",
       "      <th>price</th>\n",
       "      <th>quantity_sold</th>\n",
       "      <th>extended_sales</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>2021-01-01</td>\n",
       "      <td>Commuter Swift</td>\n",
       "      <td>495</td>\n",
       "      <td>23</td>\n",
       "      <td>11385</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>2021-01-01</td>\n",
       "      <td>Urban Rider</td>\n",
       "      <td>350</td>\n",
       "      <td>19</td>\n",
       "      <td>6650</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2021-01-01</td>\n",
       "      <td>City Cruiser</td>\n",
       "      <td>400</td>\n",
       "      <td>19</td>\n",
       "      <td>7600</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>2021-01-01</td>\n",
       "      <td>Mountain Trail Pro</td>\n",
       "      <td>1250</td>\n",
       "      <td>11</td>\n",
       "      <td>13750</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>2021-01-01</td>\n",
       "      <td>Gravel Explorer</td>\n",
       "      <td>2200</td>\n",
       "      <td>9</td>\n",
       "      <td>19800</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13144</th>\n",
       "      <td>2024-12-31</td>\n",
       "      <td>Gravel Explorer</td>\n",
       "      <td>2200</td>\n",
       "      <td>22</td>\n",
       "      <td>48400</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13145</th>\n",
       "      <td>2024-12-31</td>\n",
       "      <td>Roadster GT</td>\n",
       "      <td>2900</td>\n",
       "      <td>11</td>\n",
       "      <td>31900</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13146</th>\n",
       "      <td>2024-12-31</td>\n",
       "      <td>Carbon Storm</td>\n",
       "      <td>5000</td>\n",
       "      <td>5</td>\n",
       "      <td>25000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13147</th>\n",
       "      <td>2024-12-31</td>\n",
       "      <td>Titanium Falcon</td>\n",
       "      <td>7500</td>\n",
       "      <td>6</td>\n",
       "      <td>45000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>13148</th>\n",
       "      <td>2024-12-31</td>\n",
       "      <td>Speedster Elite</td>\n",
       "      <td>9900</td>\n",
       "      <td>7</td>\n",
       "      <td>69300</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>13149 rows × 5 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "             date          bike_model  price  quantity_sold  extended_sales\n",
       "0      2021-01-01      Commuter Swift    495             23           11385\n",
       "1      2021-01-01         Urban Rider    350             19            6650\n",
       "2      2021-01-01        City Cruiser    400             19            7600\n",
       "3      2021-01-01  Mountain Trail Pro   1250             11           13750\n",
       "4      2021-01-01     Gravel Explorer   2200              9           19800\n",
       "...           ...                 ...    ...            ...             ...\n",
       "13144  2024-12-31     Gravel Explorer   2200             22           48400\n",
       "13145  2024-12-31         Roadster GT   2900             11           31900\n",
       "13146  2024-12-31        Carbon Storm   5000              5           25000\n",
       "13147  2024-12-31     Titanium Falcon   7500              6           45000\n",
       "13148  2024-12-31     Speedster Elite   9900              7           69300\n",
       "\n",
       "[13149 rows x 5 columns]"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_loader_agent.get_artifacts(as_dataframe=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Example 5: What folders and files are available in my Documents directory?\n",
    "\n",
    "Now we'll switch things up and look at a directory that is outside of my current working directory. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "---DATA LOADER TOOLS AGENT----\n",
      "    \n",
      "    * RUN REACT TOOL-CALLING AGENT\n",
      "    * Tool: list_directory_contents | /Users/mdancho/Desktop/learning_labs\n",
      "    * POST-PROCESS RESULTS\n"
     ]
    },
    {
     "data": {
      "text/markdown": [
       "The following folders and files are available in your `Desktop/learning_labs` directory:\n",
       "\n",
       "### Folders\n",
       "1. lab_88_price_optimization_ml_r\n",
       "2. lab_53_modeltime_gluonts_saturn_cloud\n",
       "3. lab_52_stacks\n",
       "4. lab_49_feature_engineering\n",
       "5. lab_19_network_analysis\n",
       "6. lab_40_docker\n",
       "7. lab_84_chatgpt_langchain\n",
       "8. lab_85_cashflow_forecasting\n",
       "9. finance_machine_learning\n",
       "10. lab_87_price_optimization_python\n",
       "11. lab_12_r_programming_rlang\n",
       "12. plumber_api\n",
       "13. lab_72_nlp_r\n",
       "14. lab_39_shiny_app_bonus\n",
       "15. lab_76_geospatial_r\n",
       "16. lab_89_causal_inference_in_r\n",
       "17. lab_37_python_nlp\n",
       "18. lab_93_bayesian_mmm\n",
       "19. time_series_data_visualizations\n",
       "20. ML_Time_Series_Classification_Trading_Strategy\n",
       "21. lab_26_machine_learning_customer_journey\n",
       "22. lab_16_promo\n",
       "23. nowcasting_data\n",
       "24. lab_48_nlp_textrecipes\n",
       "25. covid_forecasting\n",
       "26. lab_44_shiny_powerpoint_automation\n",
       "27. lab_65_sparklyr\n",
       "28. lab_39_mlflow\n",
       "29. lab_58_cust_lifetime_r\n",
       "30. lab_29_quandl_energy\n",
       "31. lab_73_timeseries\n",
       "32. lab_83_chatgpt_2\n",
       "33. lab_14_promo\n",
       "34. lab_18_time_series_anomaly_detection\n",
       "35. lab_69_risk_analysis_py\n",
       "36. lab_66_pyspark\n",
       "37. lab_22_sql_advanced - llpro 2\n",
       "38. lab_63_nested_modeltime\n",
       "39. intro_to_parsnip\n",
       "40. lab_21_SQL_for_datascience\n",
       "41. lab_80_shiny_py\n",
       "42. h2o_workshop\n",
       "43. lab_35_finance_deep_learning\n",
       "44. lab_33_python_HR\n",
       "45. lab_60_modeltime_ecosystem\n",
       "46. webinar_introducing_pytimetk\n",
       "47. lab_56_targets_churn\n",
       "48. lab_31_shiny_google_analytics\n",
       "49. lab_64_sktime\n",
       "50. lab_15_promo\n",
       "51. lab_79_shiny_p2\n",
       "52. lab_38_modeltime\n",
       "53. lab_11_market_basket_analysis\n",
       "54. causal_inference_r_workshop\n",
       "55. lab_94_sales_forecast_python\n",
       "56. lab_76_geospatial_in_r\n",
       "57. lab_90_causal_inference_python\n",
       "58. lab_74_bayesian\n",
       "59. lab_45_shiny_golem\n",
       "60. lab_78_shiny_part1\n",
       "61. lab_17_anomaly_detection_h2o\n",
       "62. lab_16_r_optimization_modeling_2\n",
       "63. lab_27_google_trends\n",
       "64. lab_92_customer_lifetime_value_r\n",
       "65. lab_61_mmm_robyn\n",
       "66. lab_68_risk_analysis\n",
       "67. lab_34_python_ecommerce\n",
       "68. lab_41_metaflow\n",
       "69. lab_96_macroeconomic_forecasts\n",
       "70. lab_13_mortgage_loans_datatable\n",
       "71. lab_47_modeltime_recursive\n",
       "72. lab_09_finance_with_r_tidyquant\n",
       "73. lab_36_tensorflow_energy\n",
       "74. lab_57_targets_modeltime\n",
       "75. lab_79_shiny_ui_editor\n",
       "76. lab_71_nlp_py_survey\n",
       "77. lab_81_modeltime_prefect\n",
       "78. lab_23_bigquery\n",
       "79. lab_25_marketing_attribution\n",
       "80. lab_45_prep\n",
       "81. lab_14_customer_churn_survival_h2o\n",
       "82. lab_55_workflowsets\n",
       "83. lab_15_r_optimization_modeling_1\n",
       "84. lab_05_bike_sharing\n",
       "85. lab_10_plumber_api\n",
       "86. lab_22_sql_advanced\n",
       "87. lab_43_tidyverse_functions\n",
       "88. lab_51_torch_tabnet\n",
       "89. lab_24_ab_testing_infer\n",
       "90. lab_62_mmm_python\n",
       "91. lab_59_cust_lifetime_py\n",
       "92. lab_44_package_setup\n",
       "93. lab_54_modeltime_recursive\n",
       "94. lab_46_forecasting_at_scale\n",
       "95. lab_50_lightgbm\n",
       "96. lab_30_shiny_tidyquant\n",
       "97. lab_91_customer_lifetime_value_python_2\n",
       "98. lab_75_bayesian_2\n",
       "99. lab_77_geospatial_networks\n",
       "100. lab_42_plumber_api\n",
       "101. lab_20_explainable_machine_learning\n",
       "102. lab_95_time_series_py_polars_mlforecast\n",
       "103. H2O_automl_lab\n",
       "104. lab_67_spark_modeltime\n",
       "105. lab_32_twitter_tidytext_shiny\n",
       "106. lab_82_chatgpt_1\n",
       "107. lab_28_api_zillow\n",
       "108. lab_86_customer_segmentation_python\n",
       "109. big_cashflow_forecasting_project\n",
       "110. webscraping_cannondale\n",
       "111. lab_22_sql_advanced - llpro\n",
       "\n",
       "### Files\n",
       "1. my_twitter_token.rds\n",
       "2. hr.churn.test\n",
       "3. config.yml\n",
       "4. Historical Product Demand.csv\n",
       "5. 01_real_estate_analysis.R\n",
       "6. 02_bonus_nowcasting_with_modeltime.R\n",
       "7. metaflow_aws_config.json\n",
       "8. learning_labs.Rproj\n",
       "9. excel_to_r\n",
       "10. rsconnect\n",
       "11. tidymodels_hyperparam_tuning_workflows\n",
       "12. nowcasting_data\n",
       "13. covid_forecasting\n",
       "\n",
       "### Summary\n",
       "- Total folders: 110\n",
       "- Total files: 13"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# What folders and files are available in my Documents directory?\n",
    "data_loader_agent.invoke_agent(\"What folders and files are available in my Desktop/learning_labs directory? Do not return recursive results.\")\n",
    "\n",
    "data_loader_agent.get_ai_message(markdown=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can also get the artifacts from the agent:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>filename</th>\n",
       "      <th>type</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>lab_88_price_optimization_ml_r</td>\n",
       "      <td>directory</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>lab_53_modeltime_gluonts_saturn_cloud</td>\n",
       "      <td>directory</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>lab_52_stacks</td>\n",
       "      <td>directory</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>lab_49_feature_engineering</td>\n",
       "      <td>directory</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>lab_19_network_analysis</td>\n",
       "      <td>directory</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>125</th>\n",
       "      <td>lab_28_api_zillow</td>\n",
       "      <td>directory</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>126</th>\n",
       "      <td>lab_86_customer_segmentation_python</td>\n",
       "      <td>directory</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>127</th>\n",
       "      <td>big_cashflow_forecasting_project</td>\n",
       "      <td>directory</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>128</th>\n",
       "      <td>webscraping_cannondale</td>\n",
       "      <td>directory</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>129</th>\n",
       "      <td>lab_22_sql_advanced - llpro</td>\n",
       "      <td>directory</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>130 rows × 2 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                                  filename       type\n",
       "0           lab_88_price_optimization_ml_r  directory\n",
       "1    lab_53_modeltime_gluonts_saturn_cloud  directory\n",
       "2                            lab_52_stacks  directory\n",
       "3               lab_49_feature_engineering  directory\n",
       "4                  lab_19_network_analysis  directory\n",
       "..                                     ...        ...\n",
       "125                      lab_28_api_zillow  directory\n",
       "126    lab_86_customer_segmentation_python  directory\n",
       "127       big_cashflow_forecasting_project  directory\n",
       "128                 webscraping_cannondale  directory\n",
       "129            lab_22_sql_advanced - llpro  directory\n",
       "\n",
       "[130 rows x 2 columns]"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_loader_agent.get_artifacts(as_dataframe=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Example 6: Search for 'csv' files recursively in my current working directory."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "---DATA LOADER TOOLS AGENT----\n",
      "    \n",
      "    * RUN REACT TOOL-CALLING AGENT\n",
      "    * Tool: search_files_by_pattern | /Users/mdancho/Desktop/course_code/ai-data-science-team\n",
      "    * POST-PROCESS RESULTS\n"
     ]
    },
    {
     "data": {
      "text/markdown": [
       "I found 3 CSV files in your current working directory:\n",
       "\n",
       "1. **bike_sales_data.csv**\n",
       "   - Path: `/Users/mdancho/Desktop/course_code/ai-data-science-team/data/bike_sales_data.csv`\n",
       "\n",
       "2. **churn_data.csv**\n",
       "   - Path: `/Users/mdancho/Desktop/course_code/ai-data-science-team/data/churn_data.csv`\n",
       "\n",
       "3. **dirty_dataset.csv**\n",
       "   - Path: `/Users/mdancho/Desktop/course_code/ai-data-science-team/data/dirty_dataset.csv`"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Search for 'csv' files recursively in my current working directory. \n",
    "data_loader_agent.invoke_agent(\"Search for 'csv' files recursively in my current working directory.\")\n",
    "\n",
    "data_loader_agent.get_ai_message(markdown=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can then load the data from one of the CSV files. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "---DATA LOADER TOOLS AGENT----\n",
      "    \n",
      "    * RUN REACT TOOL-CALLING AGENT\n",
      "    * Tool: load_file | /Users/mdancho/Desktop/course_code/ai-data-science-team/data/churn_data.csv\n",
      "    * POST-PROCESS RESULTS\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>customerID</th>\n",
       "      <th>gender</th>\n",
       "      <th>SeniorCitizen</th>\n",
       "      <th>Partner</th>\n",
       "      <th>Dependents</th>\n",
       "      <th>tenure</th>\n",
       "      <th>PhoneService</th>\n",
       "      <th>MultipleLines</th>\n",
       "      <th>InternetService</th>\n",
       "      <th>OnlineSecurity</th>\n",
       "      <th>...</th>\n",
       "      <th>DeviceProtection</th>\n",
       "      <th>TechSupport</th>\n",
       "      <th>StreamingTV</th>\n",
       "      <th>StreamingMovies</th>\n",
       "      <th>Contract</th>\n",
       "      <th>PaperlessBilling</th>\n",
       "      <th>PaymentMethod</th>\n",
       "      <th>MonthlyCharges</th>\n",
       "      <th>TotalCharges</th>\n",
       "      <th>Churn</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>7590-VHVEG</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>Yes</td>\n",
       "      <td>No</td>\n",
       "      <td>1</td>\n",
       "      <td>No</td>\n",
       "      <td>No phone service</td>\n",
       "      <td>DSL</td>\n",
       "      <td>No</td>\n",
       "      <td>...</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>Month-to-month</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Electronic check</td>\n",
       "      <td>29.85</td>\n",
       "      <td>29.85</td>\n",
       "      <td>No</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>5575-GNVDE</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>34</td>\n",
       "      <td>Yes</td>\n",
       "      <td>No</td>\n",
       "      <td>DSL</td>\n",
       "      <td>Yes</td>\n",
       "      <td>...</td>\n",
       "      <td>Yes</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>One year</td>\n",
       "      <td>No</td>\n",
       "      <td>Mailed check</td>\n",
       "      <td>56.95</td>\n",
       "      <td>1889.5</td>\n",
       "      <td>No</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3668-QPYBK</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>2</td>\n",
       "      <td>Yes</td>\n",
       "      <td>No</td>\n",
       "      <td>DSL</td>\n",
       "      <td>Yes</td>\n",
       "      <td>...</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>Month-to-month</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Mailed check</td>\n",
       "      <td>53.85</td>\n",
       "      <td>108.15</td>\n",
       "      <td>Yes</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>7795-CFOCW</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>45</td>\n",
       "      <td>No</td>\n",
       "      <td>No phone service</td>\n",
       "      <td>DSL</td>\n",
       "      <td>Yes</td>\n",
       "      <td>...</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Yes</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>One year</td>\n",
       "      <td>No</td>\n",
       "      <td>Bank transfer (automatic)</td>\n",
       "      <td>42.30</td>\n",
       "      <td>1840.75</td>\n",
       "      <td>No</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>9237-HQITU</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>2</td>\n",
       "      <td>Yes</td>\n",
       "      <td>No</td>\n",
       "      <td>Fiber optic</td>\n",
       "      <td>No</td>\n",
       "      <td>...</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>Month-to-month</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Electronic check</td>\n",
       "      <td>70.70</td>\n",
       "      <td>151.65</td>\n",
       "      <td>Yes</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7038</th>\n",
       "      <td>6840-RESVB</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Yes</td>\n",
       "      <td>24</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Yes</td>\n",
       "      <td>DSL</td>\n",
       "      <td>Yes</td>\n",
       "      <td>...</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Yes</td>\n",
       "      <td>One year</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Mailed check</td>\n",
       "      <td>84.80</td>\n",
       "      <td>1990.5</td>\n",
       "      <td>No</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7039</th>\n",
       "      <td>2234-XADUH</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Yes</td>\n",
       "      <td>72</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Fiber optic</td>\n",
       "      <td>No</td>\n",
       "      <td>...</td>\n",
       "      <td>Yes</td>\n",
       "      <td>No</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Yes</td>\n",
       "      <td>One year</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Credit card (automatic)</td>\n",
       "      <td>103.20</td>\n",
       "      <td>7362.9</td>\n",
       "      <td>No</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7040</th>\n",
       "      <td>4801-JZAZL</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Yes</td>\n",
       "      <td>11</td>\n",
       "      <td>No</td>\n",
       "      <td>No phone service</td>\n",
       "      <td>DSL</td>\n",
       "      <td>Yes</td>\n",
       "      <td>...</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>Month-to-month</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Electronic check</td>\n",
       "      <td>29.60</td>\n",
       "      <td>346.45</td>\n",
       "      <td>No</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7041</th>\n",
       "      <td>8361-LTMKD</td>\n",
       "      <td>Male</td>\n",
       "      <td>1</td>\n",
       "      <td>Yes</td>\n",
       "      <td>No</td>\n",
       "      <td>4</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Fiber optic</td>\n",
       "      <td>No</td>\n",
       "      <td>...</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>Month-to-month</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Mailed check</td>\n",
       "      <td>74.40</td>\n",
       "      <td>306.6</td>\n",
       "      <td>Yes</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7042</th>\n",
       "      <td>3186-AJIEK</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>66</td>\n",
       "      <td>Yes</td>\n",
       "      <td>No</td>\n",
       "      <td>Fiber optic</td>\n",
       "      <td>Yes</td>\n",
       "      <td>...</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Two year</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Bank transfer (automatic)</td>\n",
       "      <td>105.65</td>\n",
       "      <td>6844.5</td>\n",
       "      <td>No</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>7043 rows × 21 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "      customerID  gender  SeniorCitizen Partner Dependents  tenure  \\\n",
       "0     7590-VHVEG  Female              0     Yes         No       1   \n",
       "1     5575-GNVDE    Male              0      No         No      34   \n",
       "2     3668-QPYBK    Male              0      No         No       2   \n",
       "3     7795-CFOCW    Male              0      No         No      45   \n",
       "4     9237-HQITU  Female              0      No         No       2   \n",
       "...          ...     ...            ...     ...        ...     ...   \n",
       "7038  6840-RESVB    Male              0     Yes        Yes      24   \n",
       "7039  2234-XADUH  Female              0     Yes        Yes      72   \n",
       "7040  4801-JZAZL  Female              0     Yes        Yes      11   \n",
       "7041  8361-LTMKD    Male              1     Yes         No       4   \n",
       "7042  3186-AJIEK    Male              0      No         No      66   \n",
       "\n",
       "     PhoneService     MultipleLines InternetService OnlineSecurity  ...  \\\n",
       "0              No  No phone service             DSL             No  ...   \n",
       "1             Yes                No             DSL            Yes  ...   \n",
       "2             Yes                No             DSL            Yes  ...   \n",
       "3              No  No phone service             DSL            Yes  ...   \n",
       "4             Yes                No     Fiber optic             No  ...   \n",
       "...           ...               ...             ...            ...  ...   \n",
       "7038          Yes               Yes             DSL            Yes  ...   \n",
       "7039          Yes               Yes     Fiber optic             No  ...   \n",
       "7040           No  No phone service             DSL            Yes  ...   \n",
       "7041          Yes               Yes     Fiber optic             No  ...   \n",
       "7042          Yes                No     Fiber optic            Yes  ...   \n",
       "\n",
       "     DeviceProtection TechSupport StreamingTV StreamingMovies        Contract  \\\n",
       "0                  No          No          No              No  Month-to-month   \n",
       "1                 Yes          No          No              No        One year   \n",
       "2                  No          No          No              No  Month-to-month   \n",
       "3                 Yes         Yes          No              No        One year   \n",
       "4                  No          No          No              No  Month-to-month   \n",
       "...               ...         ...         ...             ...             ...   \n",
       "7038              Yes         Yes         Yes             Yes        One year   \n",
       "7039              Yes          No         Yes             Yes        One year   \n",
       "7040               No          No          No              No  Month-to-month   \n",
       "7041               No          No          No              No  Month-to-month   \n",
       "7042              Yes         Yes         Yes             Yes        Two year   \n",
       "\n",
       "     PaperlessBilling              PaymentMethod MonthlyCharges  TotalCharges  \\\n",
       "0                 Yes           Electronic check          29.85         29.85   \n",
       "1                  No               Mailed check          56.95        1889.5   \n",
       "2                 Yes               Mailed check          53.85        108.15   \n",
       "3                  No  Bank transfer (automatic)          42.30       1840.75   \n",
       "4                 Yes           Electronic check          70.70        151.65   \n",
       "...               ...                        ...            ...           ...   \n",
       "7038              Yes               Mailed check          84.80        1990.5   \n",
       "7039              Yes    Credit card (automatic)         103.20        7362.9   \n",
       "7040              Yes           Electronic check          29.60        346.45   \n",
       "7041              Yes               Mailed check          74.40         306.6   \n",
       "7042              Yes  Bank transfer (automatic)         105.65        6844.5   \n",
       "\n",
       "     Churn  \n",
       "0       No  \n",
       "1       No  \n",
       "2      Yes  \n",
       "3       No  \n",
       "4      Yes  \n",
       "...    ...  \n",
       "7038    No  \n",
       "7039    No  \n",
       "7040    No  \n",
       "7041   Yes  \n",
       "7042    No  \n",
       "\n",
       "[7043 rows x 21 columns]"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_loader_agent.invoke_agent(\"Load the churn data at path /Users/mdancho/Desktop/course_code/ai-data-science-team/data/churn_data.csv\")\n",
    "\n",
    "data_loader_agent.get_artifacts(as_dataframe=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Or we can load all of the datasets from the CSV files:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "---DATA LOADER TOOLS AGENT----\n",
      "    \n",
      "    * RUN REACT TOOL-CALLING AGENT\n",
      "    * Tool: load_directory | /Users/mdancho/Desktop/course_code/ai-data-science-team/data\n",
      "    * POST-PROCESS RESULTS\n"
     ]
    },
    {
     "data": {
      "text/markdown": [
       "The following CSV files have been successfully loaded from the specified directory:\n",
       "\n",
       "1. **bike_sales_data.csv**\n",
       "2. **churn_data.csv**\n",
       "3. **dirty_dataset.csv**"
      ],
      "text/plain": [
       "<IPython.core.display.Markdown object>"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_loader_agent.invoke_agent(\"Load all csv files in /Users/mdancho/Desktop/course_code/ai-data-science-team/data/.\")\n",
    "\n",
    "data_loader_agent.get_ai_message(markdown=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's extract one of the datasets that were loaded:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "dict_keys(['bike_sales_data.csv', 'churn_data.csv', 'dirty_dataset.csv'])"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "data_loader_agent.get_artifacts().keys()\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>customerID</th>\n",
       "      <th>gender</th>\n",
       "      <th>SeniorCitizen</th>\n",
       "      <th>Partner</th>\n",
       "      <th>Dependents</th>\n",
       "      <th>tenure</th>\n",
       "      <th>PhoneService</th>\n",
       "      <th>MultipleLines</th>\n",
       "      <th>InternetService</th>\n",
       "      <th>OnlineSecurity</th>\n",
       "      <th>...</th>\n",
       "      <th>DeviceProtection</th>\n",
       "      <th>TechSupport</th>\n",
       "      <th>StreamingTV</th>\n",
       "      <th>StreamingMovies</th>\n",
       "      <th>Contract</th>\n",
       "      <th>PaperlessBilling</th>\n",
       "      <th>PaymentMethod</th>\n",
       "      <th>MonthlyCharges</th>\n",
       "      <th>TotalCharges</th>\n",
       "      <th>Churn</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>7590-VHVEG</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>Yes</td>\n",
       "      <td>No</td>\n",
       "      <td>1</td>\n",
       "      <td>No</td>\n",
       "      <td>No phone service</td>\n",
       "      <td>DSL</td>\n",
       "      <td>No</td>\n",
       "      <td>...</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>Month-to-month</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Electronic check</td>\n",
       "      <td>29.85</td>\n",
       "      <td>29.85</td>\n",
       "      <td>No</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>5575-GNVDE</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>34</td>\n",
       "      <td>Yes</td>\n",
       "      <td>No</td>\n",
       "      <td>DSL</td>\n",
       "      <td>Yes</td>\n",
       "      <td>...</td>\n",
       "      <td>Yes</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>One year</td>\n",
       "      <td>No</td>\n",
       "      <td>Mailed check</td>\n",
       "      <td>56.95</td>\n",
       "      <td>1889.5</td>\n",
       "      <td>No</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>3668-QPYBK</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>2</td>\n",
       "      <td>Yes</td>\n",
       "      <td>No</td>\n",
       "      <td>DSL</td>\n",
       "      <td>Yes</td>\n",
       "      <td>...</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>Month-to-month</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Mailed check</td>\n",
       "      <td>53.85</td>\n",
       "      <td>108.15</td>\n",
       "      <td>Yes</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>7795-CFOCW</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>45</td>\n",
       "      <td>No</td>\n",
       "      <td>No phone service</td>\n",
       "      <td>DSL</td>\n",
       "      <td>Yes</td>\n",
       "      <td>...</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Yes</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>One year</td>\n",
       "      <td>No</td>\n",
       "      <td>Bank transfer (automatic)</td>\n",
       "      <td>42.30</td>\n",
       "      <td>1840.75</td>\n",
       "      <td>No</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>9237-HQITU</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>2</td>\n",
       "      <td>Yes</td>\n",
       "      <td>No</td>\n",
       "      <td>Fiber optic</td>\n",
       "      <td>No</td>\n",
       "      <td>...</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>Month-to-month</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Electronic check</td>\n",
       "      <td>70.70</td>\n",
       "      <td>151.65</td>\n",
       "      <td>Yes</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7038</th>\n",
       "      <td>6840-RESVB</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Yes</td>\n",
       "      <td>24</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Yes</td>\n",
       "      <td>DSL</td>\n",
       "      <td>Yes</td>\n",
       "      <td>...</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Yes</td>\n",
       "      <td>One year</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Mailed check</td>\n",
       "      <td>84.80</td>\n",
       "      <td>1990.5</td>\n",
       "      <td>No</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7039</th>\n",
       "      <td>2234-XADUH</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Yes</td>\n",
       "      <td>72</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Fiber optic</td>\n",
       "      <td>No</td>\n",
       "      <td>...</td>\n",
       "      <td>Yes</td>\n",
       "      <td>No</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Yes</td>\n",
       "      <td>One year</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Credit card (automatic)</td>\n",
       "      <td>103.20</td>\n",
       "      <td>7362.9</td>\n",
       "      <td>No</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7040</th>\n",
       "      <td>4801-JZAZL</td>\n",
       "      <td>Female</td>\n",
       "      <td>0</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Yes</td>\n",
       "      <td>11</td>\n",
       "      <td>No</td>\n",
       "      <td>No phone service</td>\n",
       "      <td>DSL</td>\n",
       "      <td>Yes</td>\n",
       "      <td>...</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>Month-to-month</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Electronic check</td>\n",
       "      <td>29.60</td>\n",
       "      <td>346.45</td>\n",
       "      <td>No</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7041</th>\n",
       "      <td>8361-LTMKD</td>\n",
       "      <td>Male</td>\n",
       "      <td>1</td>\n",
       "      <td>Yes</td>\n",
       "      <td>No</td>\n",
       "      <td>4</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Fiber optic</td>\n",
       "      <td>No</td>\n",
       "      <td>...</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>Month-to-month</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Mailed check</td>\n",
       "      <td>74.40</td>\n",
       "      <td>306.6</td>\n",
       "      <td>Yes</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7042</th>\n",
       "      <td>3186-AJIEK</td>\n",
       "      <td>Male</td>\n",
       "      <td>0</td>\n",
       "      <td>No</td>\n",
       "      <td>No</td>\n",
       "      <td>66</td>\n",
       "      <td>Yes</td>\n",
       "      <td>No</td>\n",
       "      <td>Fiber optic</td>\n",
       "      <td>Yes</td>\n",
       "      <td>...</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Two year</td>\n",
       "      <td>Yes</td>\n",
       "      <td>Bank transfer (automatic)</td>\n",
       "      <td>105.65</td>\n",
       "      <td>6844.5</td>\n",
       "      <td>No</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>7043 rows × 21 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "      customerID  gender  SeniorCitizen Partner Dependents  tenure  \\\n",
       "0     7590-VHVEG  Female              0     Yes         No       1   \n",
       "1     5575-GNVDE    Male              0      No         No      34   \n",
       "2     3668-QPYBK    Male              0      No         No       2   \n",
       "3     7795-CFOCW    Male              0      No         No      45   \n",
       "4     9237-HQITU  Female              0      No         No       2   \n",
       "...          ...     ...            ...     ...        ...     ...   \n",
       "7038  6840-RESVB    Male              0     Yes        Yes      24   \n",
       "7039  2234-XADUH  Female              0     Yes        Yes      72   \n",
       "7040  4801-JZAZL  Female              0     Yes        Yes      11   \n",
       "7041  8361-LTMKD    Male              1     Yes         No       4   \n",
       "7042  3186-AJIEK    Male              0      No         No      66   \n",
       "\n",
       "     PhoneService     MultipleLines InternetService OnlineSecurity  ...  \\\n",
       "0              No  No phone service             DSL             No  ...   \n",
       "1             Yes                No             DSL            Yes  ...   \n",
       "2             Yes                No             DSL            Yes  ...   \n",
       "3              No  No phone service             DSL            Yes  ...   \n",
       "4             Yes                No     Fiber optic             No  ...   \n",
       "...           ...               ...             ...            ...  ...   \n",
       "7038          Yes               Yes             DSL            Yes  ...   \n",
       "7039          Yes               Yes     Fiber optic             No  ...   \n",
       "7040           No  No phone service             DSL            Yes  ...   \n",
       "7041          Yes               Yes     Fiber optic             No  ...   \n",
       "7042          Yes                No     Fiber optic            Yes  ...   \n",
       "\n",
       "     DeviceProtection TechSupport StreamingTV StreamingMovies        Contract  \\\n",
       "0                  No          No          No              No  Month-to-month   \n",
       "1                 Yes          No          No              No        One year   \n",
       "2                  No          No          No              No  Month-to-month   \n",
       "3                 Yes         Yes          No              No        One year   \n",
       "4                  No          No          No              No  Month-to-month   \n",
       "...               ...         ...         ...             ...             ...   \n",
       "7038              Yes         Yes         Yes             Yes        One year   \n",
       "7039              Yes          No         Yes             Yes        One year   \n",
       "7040               No          No          No              No  Month-to-month   \n",
       "7041               No          No          No              No  Month-to-month   \n",
       "7042              Yes         Yes         Yes             Yes        Two year   \n",
       "\n",
       "     PaperlessBilling              PaymentMethod MonthlyCharges  TotalCharges  \\\n",
       "0                 Yes           Electronic check          29.85         29.85   \n",
       "1                  No               Mailed check          56.95        1889.5   \n",
       "2                 Yes               Mailed check          53.85        108.15   \n",
       "3                  No  Bank transfer (automatic)          42.30       1840.75   \n",
       "4                 Yes           Electronic check          70.70        151.65   \n",
       "...               ...                        ...            ...           ...   \n",
       "7038              Yes               Mailed check          84.80        1990.5   \n",
       "7039              Yes    Credit card (automatic)         103.20        7362.9   \n",
       "7040              Yes           Electronic check          29.60        346.45   \n",
       "7041              Yes               Mailed check          74.40         306.6   \n",
       "7042              Yes  Bank transfer (automatic)         105.65        6844.5   \n",
       "\n",
       "     Churn  \n",
       "0       No  \n",
       "1       No  \n",
       "2      Yes  \n",
       "3       No  \n",
       "4      Yes  \n",
       "...    ...  \n",
       "7038    No  \n",
       "7039    No  \n",
       "7040    No  \n",
       "7041   Yes  \n",
       "7042    No  \n",
       "\n",
       "[7043 rows x 21 columns]"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Convert the churn_data.csv artifact into a pandas DataFrame\n",
    "pd.DataFrame(data_loader_agent.get_artifacts()['churn_data.csv'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "ds4b_301p_dev",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
